BMC Medical Informatics and Decision Making
Top medRxiv preprints most likely to be published in this journal, ranked by match strength.
Background: Predictive models employing machine learning algorithms are increasingly being used in clinical decision making, and improperly calibrated models can result in systematic harm. We sought to investigate the impact of class imbalance correction, a commonly applied preprocessing step in machine learning model development, on calibration and modelled clinical decision making in a large real-world context. Methods: A histogram boosted gradient classifier was trained on a highly imbalanced na...
Background: Peripheral artery disease (PAD) and chronic limb-threatening ischemia (CLTI) cause substantial morbidity and mortality, yet research progress is limited by fragmented, non-standardized data. The Observational Medical Outcomes Partnership (OMOP) Common Data Model (CDM) provides a standardized framework for electronic health record (EHR) research but lacks domain-specific detail for peripheral vascular diseases. This study aimed to develop and test a vascular-specific OMOP CDM extension ...
The increasing use of Real-World Data (RWD) in clinical research is critical for evidence-based decision making but presents challenges to data analytics. Unlike in Randomized Controlled Trials (RCTs), missingness in RWD occurs often and can include complex patterns which may or may not be Missing at Random (MAR). Informative absence (Missing Not at Random, or MNAR) occurs when the absence of data itself is a clinical signal. Applications of ad-hoc methods, or even popular "universal" methods, c...
Background: Machine learning models for perioperative mortality prediction show strong internal discrimination, yet external validation, particularly across continents, remains rare. Whether intraoperative vital sign features, which improve internal performance by 5-10%, transfer across populations is unknown. Furthermore, aggregate discrimination metrics may overstate clinical utility through Simpson's paradox if models separate risk strata without discriminating within them. Methods: We conducted ...
The use of Electronic Health Records (EHRs) has increased significantly in recent years. However, a substantial portion of the clinical data remains in unstructured text formats, especially in the context of radiology. This limits the application of EHRs for automated analysis in oncology research. Pretrained language models have been utilized to extract feature embeddings from these reports for downstream clinical applications, such as treatment response and survival prediction. However, a thor...
Background: Early prediction of in-hospital death remains a significant challenge due to the limited availability of structured data during initial admission. Unstructured clinical notes, which often contain important observations and impressions, are an underutilized resource for real-time risk stratification. While leveraging recent advances in large language models (LLMs) is a promising approach to use this unstructured information, the lack of understanding of the uncertainty of LLM predictions...
Large language models (LLMs) are increasingly used in clinical settings, yet their effect on the diagnostic accuracy of physicians has not been systematically quantified. We conducted a systematic review and meta-analysis of studies analyzing LLM-assisted diagnosis published between January 2020 and June 2025. Across 15 studies (43 effect sizes; 498 physicians; 7,274 case evaluations), LLM assistance significantly improved diagnostic accuracy compared to physicians without LLM support (Hedges' g = 0....
Background: Albuminuria is associated with increased risk of cardiovascular disease (CVD), heart failure, and progression of chronic kidney disease (CKD). Early detection of albuminuria, done through spot urine albumin creatinine ratio (UACR) testing, enables more accurate risk stratification and timely use of preventative therapies. Yet UACR testing remains unacceptably low in the hypertension population. Methods: We evaluated two EHR-embedded clinical decision support (CDS) strategies at Geisinger Health Syste...
Purpose: Natural Language Processing (NLP) has the potential to extract structured clinical knowledge from unstructured Electronic Health Records (EHRs). However, the limited availability of annotated datasets for algorithm training restricts its application in clinical practice. This study investigates the use of transformer-based NLP models to structure Italian EHRs in cardiac settings, addressing this gap. Methods: We implemented and evaluated three named entity recognition algorithms: SpaCy, Fl...
The project aimed to develop a data-driven approach for predicting platelet recovery in cancer treatment-induced thrombocytopenia (CTIT) patients receiving recombinant human thrombopoietin (Rh-TPO). By integrating key clinical indicators into a predictive modeling framework, the study sought to enhance understanding of individual treatment responses and facilitate timely clinical decision-making. A retrospective two-stage modeling analysis was conducted on 400 hospitalized CTIT patients who rece...
Background: Identification of minimally invasive biomarkers of different stages of cachexia (Ca), and precachexia (PCa) in particular, might help clinicians in treating patients with pancreatic ductal adenocarcinoma (PDAC) at high risk of progressing to a more severe cachectic stage. In this work, we developed a machine-learning (ML) model optimized for blood biomarker data that identifies precachectic and cachectic patients. Methods: Blood and clinical data were collected from treatment-naive patie...
Artificial intelligence models in healthcare often fail to improve patient outcomes despite strong predictive performance because they are frequently developed with limited understanding of clinical workflows and system implementation. We demonstrate a human-centered design approach to define prediction targets before model development, ensuring alignment with actionable clinical interventions. Using pediatric acute kidney injury as a case study, we convened a multidisciplinary working group and...
Purpose: Studies based on electronic health records (EHR) often rely on structured data, which may incompletely capture important clinical phenotypes in EHR notes. The purpose of this study was to assess two natural language processing (NLP) tools to extract phenotypes from unstructured EHR notes, and to evaluate the added value of integrating NLP-derived phenotypes with structured EHR data at a health system scale. Methods: This retrospective study is based on inpatient and outpatient EHR data fro...
Background: The accuracy and safety of generating medication orders by large language models (LLMs) must be demonstrated. Without standardization, performance evaluation is limited to time- and resource-intensive clinician grading. This evaluation aimed to develop a standardized medication format that supports automated performance evaluation (MedMatch). Methods: First, a survey of 40 medication prompts was given to clinicians to assess agreement in medication order communication. Second, a clinicia...
Objective: To comprehensively evaluate the validity of ICD-10-CM codes for both prevalent diagnoses and less common diseases, and to assess the performance of a large language model (LLM)-based system in validating these codes. Materials and Methods: This retrospective study analyzed hospital admissions from the Medical Information Mart for Intensive Care (MIMIC-IV) database. We developed a validated LLM-based system using GPT-4o, refined through iterative prompt engineering, to assess ICD-10-CM co...
Electronic health records (EHRs) provide a large source of data that can be used for research purposes. Extraction of information from unstructured clinical notes in EHRs can be automated by large language models (LLMs). Although LLMs are promising for this task, challenges remain in reliable application of LLMs to EHR, including the lack of development and validation for languages other than English. Here, we identified Dutch LLMs and compared their performance in a case study. We selected the ...
Background: Integrating advanced artificial intelligence (AI) into clinical decision-support often requires the sharing of sensitive patient data with external services, raising privacy concerns. Homomorphic encryption (HE) allows computing directly on encrypted data, without revealing the underlying patient information. Objectives: To develop a large language model (LLM)-assisted diagnosis framework while preserving patient privacy in the clinical text analysis, by leveraging HE and using rare dis...
Background: Typing in the electronic health record (EHR) takes up healthcare providers' time and cognitive space and constitutes a substantial administrative burden contributing to high burnout rates in healthcare. Ambient digital scribes may alleviate this problem. Objective: To investigate the effect of the use of Autoscriber, an ambient digital scribe, on healthcare providers' administrative workload and the quality of medical notes in the EHR. Methods: A study period of 26 weeks was randomized into ...
Background: The use of large language models (LLMs) is increasing in the medical field; however, LLMs are often subject to "confabulations." Notably, LLMs are vulnerable to adversarial attacks, or fabricated details within prompts, which is concerning given both health misinformation and inadvertent errors in the medical record. The purpose of this study was to determine the effect of adversarial attacks by embedding one fabricated medication into a list of existing medicines. Methods: A total...
Medical errors are one of the leading causes of death in the United States. Several public databases have been built to record patient safety events across healthcare systems to better understand and improve safety hazards. These reports typically include both structured fields (e.g., event type, device, manufacturer) and unstructured data elements (free text narrative of what happened). The structured fields are usually restricted to a limited number of categories, whereas the unstructured fiel...